samtools view \
-@ 4 \
-uS -o SRR769545_mem.bam SRR769545_mem.sam
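The “-@” option specifies the number of additional threads, “-u” produces uncompressed
BAM output, “-S” indicates that the input is in SAM format (older Samtools versions required
it; recent versions auto-detect the input format and ignore it), and “-o” specifies the
output file. Use “-b” instead of “-u” to write compressed BAM, which takes less disk space.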
If you have already run the command above and the original SAM file is still present, you
can delete it with “rm SRR769545_mem.sam” and then run the following to convert the BAM
file back to a SAM file:
samtools view \
-@ 4 -h \
-o SRR769545_mem.sam \
SRR769545_mem.bam
The “-h” option includes the header in the SAM output.
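A quick sanity check after converting in either direction is to compare the number of alignment records in the two files. The following is a minimal sketch, not part of the original workflow: the helper names count_records and same_record_count are hypothetical, and it relies on “samtools view -c”, which prints the record count instead of the records themselves.

```shell
# Hypothetical helpers for checking a SAM <-> BAM conversion.

count_records() {
    # "samtools view -c" prints only the number of alignment records.
    samtools view -c "$1"
}

same_record_count() {
    # Compare the record counts of two alignment files.
    local a b
    a=$(count_records "$1") || return 2
    b=$(count_records "$2") || return 2
    if [ "$a" -eq "$b" ]; then
        echo "OK: $a records in both files"
    else
        echo "MISMATCH: $1 has $a records, $2 has $b"
        return 1
    fi
}

# Example call, once both files exist:
# same_record_count SRR769545_mem.sam SRR769545_mem.bam
```

Matching counts do not prove that the files are identical, but a mismatch is a clear sign that a conversion was interrupted or run on the wrong input.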
2.4.1.2 Sorting Alignments
Sorted alignments are required for some applications such as variant calling and RNA-Seq.
Samtools can sort the alignments in a BAM file by coordinate, by read name, or by the value
of a tag. By default, “samtools sort” sorts the alignments by coordinate. With the “-n”
option, it sorts them by read name; with the “-t TAG” option, it sorts them by the value of
the given tag. Refer to the SAM and BAM file formats discussed above to learn about the
standard tags.
In the following, the BAM file will be sorted by coordinate order:
samtools sort \
-@ 4 \
-T mem.tmp.sort \
-o SRR769545_mem_sorted.bam \
SRR769545_mem.bam
This will create a new BAM file with sorted alignments. You can delete the unsorted BAM
file if you need to save some storage space.
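The two non-default sort modes mentioned above follow the same pattern as the coordinate sort. The sketch below wraps them in small shell functions. The function names, the temporary-file prefixes, and the choice of the NM tag in the example call are assumptions made for illustration. The sort_order helper reads the “SO” field that the “@HD” header line carries when a sort order has been recorded.

```shell
# Hypothetical wrappers around "samtools sort"; names and prefixes are illustrative.

sort_by_name() {
    # -n: sort by read name (QNAME) instead of coordinate.
    # Usage: sort_by_name input.bam output.bam
    samtools sort -n -@ 4 -T mem.tmp.namesort -o "$2" "$1"
}

sort_by_tag() {
    # -t TAG: sort by the value of an optional tag.
    # Usage: sort_by_tag TAG input.bam output.bam
    samtools sort -t "$1" -@ 4 -T mem.tmp.tagsort -o "$3" "$2"
}

sort_order() {
    # Print the SO value from the @HD header line, if one is present.
    samtools view -H "$1" |
        awk -F'\t' '$1 == "@HD" { for (i = 2; i <= NF; i++) if ($i ~ /^SO:/) print substr($i, 4) }'
}

# Example calls (output file names are assumptions):
# sort_by_name SRR769545_mem.bam SRR769545_mem_namesorted.bam
# sort_by_tag NM SRR769545_mem.bam SRR769545_mem_nmsorted.bam
# sort_order SRR769545_mem_sorted.bam
```

For the coordinate-sorted file created above, sort_order would be expected to report “coordinate”.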
2.4.1.3 Indexing BAM File
Before a BAM file is used in downstream analysis, it can be indexed to allow fast random
access to the alignments in the file, so that the alignments in a given region can be
retrieved without reading the whole file. The BAM file must be sorted by coordinate before
it is indexed. We can use the “samtools index” command to index the above sorted BAM
file as follows:
samtools index SRR769545_mem_sorted.bam
This will generate the BAM index file “SRR769545_mem_sorted.bam.bai”.
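Once the index exists, Samtools can retrieve the alignments that overlap a region without scanning the entire file. The sketch below counts the alignments in a region. The function name count_in_region is hypothetical, the check assumes the default “.bam.bai” index name, and the region in the example call is a placeholder, since the valid sequence names depend on the reference genome used for the alignment.

```shell
# Hypothetical helper: count the alignments overlapping a region of an indexed BAM file.

count_in_region() {
    local bam=$1 region=$2
    # Region queries need an index; this check assumes the default <file>.bam.bai name.
    if [ ! -e "${bam}.bai" ]; then
        echo "index ${bam}.bai not found; run: samtools index ${bam}" >&2
        return 1
    fi
    # A region is written as NAME or NAME:START-END; -c prints the count only.
    samtools view -c "$bam" "$region"
}

# Example call (the region name is a placeholder):
# count_in_region SRR769545_mem_sorted.bam chr1:1000000-2000000
```

The sequence names that can be used in a region are listed in the “@SQ” lines printed by “samtools view -H”.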